1 Introduction

1.1 Create Time Series

What is the expected period (time period of repetition) and the time step for the following time series:

dat = floor(runif(n= 1000, min = 1, max= 2000))
  a) Sunshine duration per month in Basel from 1990 to 2000
ts(data = dat, start=c(1990,1),end=c(2000,12), frequency = 12)
  b) Number of newborn babies in the city of Zurich per year from 2000 to 2011
ts(data=dat, start=2000,end=2011,frequency = 1)

  c) Number of reservations in a restaurant for every night during 4 weeks

ts(data = dat, start = c(1,1), end = c(4,7), frequency = 7) # end = c(4,7): last night of week 4
  d) Water runoff of a river. The data has been collected every day for 4 years
ts(data = dat, start = c(1,1), end = c(4,365), frequency = 365) # end = c(4,365): last day of year 4
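As a quick sketch (not part of the original exercise), the interplay of start, end and frequency can be checked directly in R, using the monthly series from a):

```r
# Sketch: how start/end/frequency define the sampling grid of a ts object
dat <- floor(runif(n = 1000, min = 1, max = 2000))
sunshine <- ts(data = dat, start = c(1990, 1), end = c(2000, 12), frequency = 12)
frequency(sunshine)   # 12 observations per period (one year)
length(sunshine)      # 11 years x 12 months = 132; surplus values in dat are dropped
cycle(sunshine)[1:3]  # position within the period: 1 2 3 (Jan, Feb, Mar)
```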

1.2 Simulate Time Series

zeitsim <- read.table('https://stat.ethz.ch/Teaching/Datasets/WBL/zeitsim.dat')
zeitsim.ts <- ts(zeitsim[,1:4], start = 1, frequency = 1)

Simulate time series according to the following models:


a) Y1: Yt = Et − 0.5 · Et−1 , where Et ∼ N(0, 1) i.i.d. E0 = 0

set.seed(100)
Et <- ts(rnorm(101, 0, 1))
Et[1] <- 0
y1 <- 0
for (i in 2:length(Et)) {
  y1[i] <- Et[i] - 0.5 * Et[i-1]
}

y1 <- y1[2:length(y1)]
ts.y1 <- ts(y1)

par(mfrow= c(1,2))
plot(ts.y1, lwd = 1.5, main = 'Y1: Yt = Et − 0.5 · Et−1 , where Et ∼ N(0, 1) i.i.d. E0 = 0')
plot(zeitsim.ts[,1], main = 'Zeitsim Y1', lwd = 1.5)

Answer:

  • stationary time series

  • MA(1) process

b) Y2: Yt = Yt−1 + Et, where Et ∼ N(0, 1) i.i.d. Y0 = 0

set.seed(100)
Et <- ts(rnorm(100,0,1))
y2 <- 0
for (i in 2:length(Et)) {
  y2[i] <- y2[i-1] + Et[i]
}

ts.y2 <- ts(y2)

par(mfrow = c(1,2))
plot(ts.y2, main = 'Y2: Yt = Yt−1 + Et, where Et ∼ N(0, 1) i.i.d. Y0 = 0', lwd = 1.5)
plot(zeitsim.ts[,2], main = 'Zeitsim Y2', lwd = 1.5)

Answer:

  • non-stationary time series (random walk)

  • Yt−1 and Et are uncorrelated random variables

c) Y3: Yt = 0.5 · Yt−1 + Et, where Et ∼ N(0, 1) i.i.d. Y0 = 0

set.seed(100)
Et <- ts(rnorm(100,0,1))
y3 <- 0
for (i in 2:length(Et)) {
  y3[i] <- 0.5 * y3[i - 1] + Et[i]
}

ts.y3 <- ts(y3)

par(mfrow = c(1,2))
plot(ts.y3, lwd = 1.5, main= 'Y3: Yt = 0.5 · Yt−1 + Et, where Et ∼ N(0, 1) i.i.d. Y0 = 0')
plot(zeitsim.ts[,3], main = 'Zeitsim Y3', lwd = 1.5)

Answer:

  • stationary time series

  • AR(1) process

d) Y4: Yt = Yt−1 · Et, where Et ∼ U(0.95, 1.05) i.i.d. Y0 = 1

set.seed(100)
Et <- ts(runif(100,0.95, 1.05))
y4 <- 1
for (i in 2:length(Et)) {
  y4[i] <- y4[i-1] * Et[i]
}

ts.y4 <- ts(y4)

par(mfrow= c(1,2))
ts.plot(ts.y4, main ='Y4: Yt = Yt−1 · Et, where Et ∼ U(0.95, 1.05) i.i.d. Y0 = 1', lwd = 1.5)
plot(zeitsim.ts[,4], main = 'Zeitsim Y4', lwd = 1.5)

Answer:

  • non-stationary time series, as Var(Yt) is NOT constant

2 Transformation

  • Linear transformation (mostly for unit/range conversion)

  • Log transformation (to reduce right-skewedness in data distribution)

2.1 Stationarity

A stationary time series is one whose properties do not depend on the time at which the series is observed. Thus, time series with trends, or with seasonality, are not stationary — the trend and seasonality will affect the value of the time series at different times. A white noise series is stationary — it does not matter when you observe it, it should look much the same at any point in time.

Some cases can be confusing — a time series with cyclic behaviour (but with no trend or seasonality) is stationary. This is because the cycles are not of a fixed length, so before we observe the series we cannot be sure where the peaks and troughs of the cycles will be.

In general, a stationary time series will have no predictable patterns in the long-term. Time plots will show the series to be roughly horizontal (although some cyclic behaviour is possible), with constant variance.

2.1.1 Testing Stationarity

  • Only the time series process can be tested for stationarity, not the realization

  • Augmented Dickey-Fuller test

  • Phillips-Perron Test

  • KPSS Test

  • Elliott-Rothenberg-Stock

  • Schmidt-Phillips

  • Zivot-Andrews

2.2 Skewness

The direction of the skew depends on whether the bulk of the data lies above the mean (negative skew) or below the mean (positive skew).

2.3 Linear Transformation

Yt = mXt + b with m and b as constants. Used for conversion of units:

CHF to EUR, km to metres, joules to kilowatt-hours

Under a linear transformation, key properties such as autocorrelations remain unchanged; models and forecasts simply scale along.

2.4 Log Transformation

Yt = log(Xt)

Used for right-skewed time series distributions, or if the data show variation that increases or decreases with the level of the series; in that case a transformation can be useful. Logarithms can help to stabilise the variance of a time series. For example, if we denote the original observations as y1,…,yT and the transformed observations as w1,…,wT, then wt = log(yt). Logarithms are interpretable: changes in a log value are relative (or percentage) changes on the original scale. So if log base 10 is used, then an increase of 1 on the log scale corresponds to a multiplication by 10 on the original scale. Another useful feature of log transformations is that they constrain the forecasts to stay positive on the original scale.

Example (log is the logarithm to the base 10), i.e.  log(1) = 0 log(10) = 1 log(100) =2 log(1000)=3 and so forth.

This means that the logarithm squeezes all values between 1 and 10 to the range between 0 and 1, all values between 10 and 100 to the range between 1 and 2, and all values between 100 and 1000 to the range between 2 and 3. In conclusion, the higher the values get, the more the logarithm squeezes them together.
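The squeezing of the decades can be verified numerically (a small sketch, using log10 from base R):

```r
# Numeric illustration of the squeezing (log to the base 10)
x <- c(1, 10, 100, 1000)
log10(x)                  # 0 1 2 3
# every decade is mapped onto an interval of length 1:
log10(100)  - log10(10)   # 1
log10(1000) - log10(100)  # 1
```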

3 Methods for trend determination/elimination

  • (linear) fitting, e.g. with ordinary least square fitting

  • (Backwards) differencing (higher order differences eliminate higher order polynomial trends)

  • filtration

Advantages and drawbacks of the individual procedures

3.1 Linear Regression

Detrending with linear regression is possible if the trend is deterministic (e.g. a linear trend): run a linear regression to estimate the trend and remove it from the data.

In deterministic models, the output of the model is fully determined by the parameter values and the initial conditions.

Stochastic models possess some inherent randomness. The same set of parameter values and initial conditions will lead to an ensemble of different outputs.

If the trend is stochastic you should detrend the series by taking first differences on it.

3.2 Differencing

One way to make a non-stationary time series stationary is to compute the differences between consecutive observations; this procedure is known as differencing.

Differencing can help stabilise the mean of a time series by removing changes in the level of a time series, and therefore eliminating (or reducing) trend and seasonality.

Connection to calculus

Summary:

  + trend and seasonal effect can be removed
  + procedure is very quick and very simple to implement

  − trend, season and remainder component are NOT known, and cannot be visualised
  − resulting time series will be shorter than original data set
  − differencing creates artificial new dependencies that are different from the original ones
  − extrapolation of trend, season and remainder is not easily possible
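A minimal sketch (synthetic data, not from the course) showing both the trend removal and the shortening of the series:

```r
# Sketch: first-order differencing removes a linear trend completely
x <- ts(2 * (1:50) + 5)      # purely deterministic linear trend, slope 2
dx <- diff(x, differences = 1)
unique(as.numeric(dx))       # every difference equals the slope: 2
length(dx)                   # 49 -- the differenced series is one value shorter
```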

3.3 Random Walk

3.4 Second-order differencing

3.4.1 Taylor’s Theorem

knitr::include_graphics('taylor.png')

3.5 Seasonal differencing

Seasonality manifests itself by a repetitive behaviour after a fixed lag time p.

Yt = Xt - Xt-p = (1 - B^p)Xt

Note: (1 - B^p) is NOT (1 - B)^p
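The difference between the two operators can be demonstrated directly in R (a sketch on an artificial series):

```r
# Sketch: (1 - B^p) is diff(x, lag = p); (1 - B)^p is diff(x, differences = p)
x <- ts(1:24, frequency = 12)
d.lag <- diff(x, lag = 12)          # Xt - Xt-12: every value is 12
d.pow <- diff(x, differences = 12)  # 12-fold ordinary differencing: all zero
as.numeric(d.lag)
as.numeric(d.pow)
```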

Example of time series representation with seasonality

Example of time series representation with seasonality

Calculation of the time series with seasonality

3.6 Backshift operator and Backshift notation

3.6.1 Remove linear Trend of a non-stationary time series by taking differences of first order at lag 1: diff(x, differences = 1)

The new time series Yt is going to be stationary, but has new, strong and artificial dependencies

  • Yt are now the observation-to-observation changes in the series, NO longer the observations or the remainder itself
  • does not yield estimates for trend, season and remainder component

3.6.2 Differencing for log-transformed series: diff(log(x), differences = 1, lag = 1)

3.6.3 Remove polynomial Trend: diff(x, differences = 2 or more)

This will remove a polynomial trend. A quadratic trend can be removed by taking second-order differences (differences = 2).
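A small sketch (synthetic quadratic trend) illustrating why one difference is not enough but two are:

```r
# Sketch: a quadratic trend survives one difference but not two
x <- ts((1:30)^2)
d1 <- diff(x, differences = 1)   # 3, 5, 7, ... -- still a linear trend
d2 <- diff(x, differences = 2)   # constant 2 -- trend removed
unique(as.numeric(d2))
```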

Time series with seasonal effects can be made stationary through differencing by comparing to the previous periods’ value.

3.6.4 Remove Seasonality: diff(x, lag = p)

To remove trend and seasonality, take first-order differences at the lag of the period p, which removes both trend and seasonal effects, e.g. for p = 12: diff(x, differences = 1, lag = 12)

3.7 Filtration to reduce noise impact

Differencing and filtering are not the same!

Filtering is used to reduce the noise component: by averaging multiple (e.g. 3) measurements, short-term variations are smoothed out and the trend becomes visible -> we obtain a new, averaged time series.

In the case of seasonal effects, the filter has to span the full period to extract the trend; to quantify the seasonal impact, the values at the same position within the season have to be averaged.
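A sketch of a simple running mean of length 3 (the filter weights are an assumption for illustration, computed with stats::filter):

```r
# Sketch: running mean of length 3 with stats::filter() to damp the noise
set.seed(1)
x <- ts(sin(2 * pi * (1:60) / 12) + rnorm(60, sd = 0.3))
x.smooth <- stats::filter(x, filter = rep(1/3, 3), sides = 2)
# centred average: x.smooth[t] = (x[t-1] + x[t] + x[t+1]) / 3
sum(is.na(x.smooth))   # 2: one value is lost at each end
```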

3.7.1 Additive linear filters

Example of filter application

3.7.2 Exponential smoothing filter

3.7.3 Advantages and Drawbacks

Smoothing, filtering:

  + trend and seasonal effect can be estimated

  + trend, season and remainder are explicitly known & can be visualized

  + procedure is transparent and simple to implement

  − resulting time series will be shorter than the original

  − the running mean is not the very best smoother

  − extrapolation of trend and season are not entirely obvious

  − seasonal effect is handled as constant over time

3.8 Decomposition

The aim of time series analysis is to reduce the series to stationarity.

3.8.1 Basic Structures

The following two structures are considered for basic decomposition models:

  • Additive: Xt = Trend + Seasonal + Random
  • Multiplicative: Xt = Trend * Seasonal * Random

How to choose between additive and multiplicative decompositions:

  • The additive model is useful when the seasonal variation is relatively constant over time.
  • The multiplicative model is useful when the seasonal variation increases over time.

The additive decomposition is the most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series.

When the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series, then a multiplicative decomposition is more appropriate. Multiplicative decompositions are common with economic time series.

An alternative to using a multiplicative decomposition is to first transform the data until the variation in the series appears to be stable over time, then use an additive decomposition. When a log transformation has been used, this is equivalent to using a multiplicative decomposition because Xt = Trend * Seasonal * Random is equivalent to log(Xt) = log(Trend) + log(Seasonal) + log(Random).

3.9 Identify characteristics of Time Series

Have a look at the following set of time series. For every series, shortly answer the following questions (motivate your answers):

  • Is the time series stationary?
  • Is there a trend?
  • Can one find some seasonal effect? If yes, what is the period?
  • Which transformation should be applied (if required)?

Answer: - non-stationary - non-linear trend, mostly increasing - no seasonal component - log transformation

Answer: - seems stationary - trend from before was removed by diff(differences = 1) - constant variance achieved by taking the logarithm. However, log-returns are NOT stationary, because of volatility clusters

Answer: - non-stationary - exponentially increasing trend - seasonal component with period 1 year - increasing variance, log transformation beneficial to get additive components

Answer: - non-stationary - increasing trend - seasonal component with period 1 year and increasing variance - increasing variance, log transformation beneficial to get additive components

Answer: - non-stationary - no visible trend - seasonal effect with period 1 year

Yearly numbers of sunspots:

Answer: - likely to be stationary - no trend - no seasonal effect - periodicity with period of 11 years - variance may differ over time - log transformation required due to right skewness

Answer: - probably stationary - no clear trend visible - no seasonal effect visible - no transformation required

4 Quantitative Description and Correlation functions

Autocorrelation measures the linear relationship between lagged values of a time series and quantifies dependencies. There are several autocorrelation coefficients, corresponding to each panel in the lag plot. For example,
r1 measures the relationship between yt and yt−1, r2 measures the relationship between yt and yt−2, and so on.

4.1 Lagged Scatterplots

Lagged scatterplots make it possible to examine correlations of consecutive observations: the original time series values are plotted against a time-shifted version of the series, e.g. xi against xi−k. The average over a realization is exchanged with the average over time. We assume that for stationary time series, the correlation depends only on the lag.

Observations:

The plot with lag 1 shows a linear pattern, which indicates a correlation between subsequent data points. A lag of 10, however, results in a rather unspecific scatterplot.
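The original plots are not reproduced here, but the observation can be sketched on a simulated autocorrelated series (the AR coefficient 0.8 is an assumption for illustration; lag.plot is from base R):

```r
# Sketch of the lagged-scatterplot idea on a simulated AR(1)-type series
set.seed(100)
x <- arima.sim(model = list(ar = 0.8), n = 200)
lag.plot(x, set.lags = c(1, 10))          # lag 1: linear pattern; lag 10: diffuse
cor.lag1  <- cor(x[-200], x[-1])          # correlation of (x_t, x_{t+1}): high
cor.lag10 <- cor(x[1:190], x[11:200])     # correlation of (x_t, x_{t+10}): much weaker
c(cor.lag1, cor.lag10)
```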

Formula:

4.2 Plug-in Estimator

Plug-in estimation is the standard approach for computing autocorrelations in time series analysis and improves on the lagged-scatterplot idea.

Calculated with the acf() function in R.

Formula

4.2.1 Lagged scatterplot estimator versus Plug-in estimator for autocorrelation

The major differences:

Autocorrelation ACF

The ACF plot is also useful for identifying non-stationary time series. For a stationary time series, the ACF will drop to zero relatively quickly, while the ACF of non-stationary data decreases slowly. Also, for non-stationary data, the value of r1 is often large and positive.

  • For time series shorter than 100 elements, the plug-in estimator p(k) is biased

Concept: The squared correlation p(k)^2 corresponds to the percentage of variability explained by the linear association between Xt and Xt+k -> R^2.

Variance: Var(Xt) = E[(Xt-mean)^2] = Cov(Xt,Xt) with mean = E[Xt]

4.2.2 ACF Characteristics

4.2.3 Patterns in ACF of non-stationary time series

4.2.4 Trend in ACF plots

When data have a trend, the autocorrelations for small lags tend to be large and positive because observations nearby in time are also nearby in size. So the ACF of trended time series tend to have positive values that slowly decrease as the lags increase.

4.2.5 Season in ACF plots

When data are seasonal, the autocorrelations will be larger for the seasonal lags (at multiples of the seasonal frequency) than for other lags.

When data are both trended and seasonal, you see a combination of these effects.

4.3 Partial Autocorrelation PACF

The ACF measures the „simple“ dependence between Xt and Xt+k, whereas the PACF measures that dependence in a „multiple“ fashion. Given a time series Xt, the partial autocorrelation at lag k is the autocorrelation between Xt and Xt+k with the linear dependence of Xt+1 through Xt+k−1 removed. It is a conditional correlation of the random variables A, B under condition C.
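As a sketch (simulated AR(1), coefficient 0.7 chosen for illustration), the PACF of an AR(1) process should cut off after lag 1:

```r
# Sketch: for an AR(1) process the PACF cuts off after lag 1
set.seed(100)
x  <- arima.sim(model = list(ar = 0.7), n = 500)
pa <- pacf(x, plot = FALSE)
round(pa$acf[1:3], 2)   # lag 1 is large; lags 2, 3 fall inside +/- 2/sqrt(500)
```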

4.4 Ljung-Box Test

The Ljung-Box approach tests the Null-Hypothesis that a number of autocorrelation coefficients are simultaneously equal to 0. Meaning, the Ljung-Box test tests for significant autocorrelation in a time series.

The Ljung–Box test may be defined as:

H0: The data are independently distributed (p-value > 0.05)

Ha: The data are not independently distributed; they exhibit serial correlation. (p-value < 0.05)

Formula and R-code:
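A minimal sketch with Box.test() on a simulated autocorrelated series (the original course data is not shown here; the AR coefficient is an assumption for illustration):

```r
# Sketch: Ljung-Box test on a simulated AR(1) series (not the course data)
set.seed(100)
x  <- arima.sim(model = list(ar = 0.6), n = 100)
lb <- Box.test(x, lag = 10, type = "Ljung-Box")
lb$p.value   # clearly below 0.05 -> reject H0, serial correlation is present
```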

Outcome:

The p-value shows that the series is not independently distributed: H0 is rejected, Ha is accepted.

4.5 White Noise

Time series that show no autocorrelation are called white noise. For white noise series, we expect each autocorrelation to be close to zero. Of course, they will not be exactly equal to zero as there is some random variation. For a white noise series, we expect 95% of the spikes in the ACF to lie within ±2/√T where T is the length of the time series. It is common to plot these bounds on a graph of the ACF. If one or more large spikes are outside these bounds, or if substantially more than 5% of spikes are outside these bounds, then the series is probably not white noise.
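The ±2/√T bounds can be checked on a simulated white noise series (a sketch; T = 200 is an arbitrary choice):

```r
# Sketch: ACF of simulated white noise stays mostly inside +/- 2/sqrt(T)
set.seed(100)
T  <- 200
wn <- rnorm(T)
bound <- 2 / sqrt(T)
r  <- acf(wn, plot = FALSE)$acf[-1]   # drop lag 0, which is always 1
mean(abs(r) > bound)                  # roughly 5% of the spikes lie outside
```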

4.6 STL Decomposition

The Seasonal-Trend Decomposition Procedure by Loess

  • is an iterative, non-parametric smoothing algorithm
  • yields a simultaneous estimation of trend and seasonal effect -> robust!

  + very simple to apply
  + very illustrative and quick
  + seasonal effect can be constant or smoothly varying

  − model free, extrapolation and forecasting is difficult

-> Good method for „having a quick look at the data“

5 Modeling

Exponential smoothing and ARIMA models are the two most widely used approaches to time series forecasting, and provide complementary approaches to the problem. While exponential smoothing models are based on a description of the trend and seasonality in the data, ARIMA models aim to describe the autocorrelations in the data.

Purpose of modelling: - to explain hidden structures - to predict future values

Challenge: - What model is the most suitable model?

5.1 AR(p) - Autoregressive Models

AR models are exclusively used for the modeling of STATIONARY processes and are based on the idea that the current value of the series can be explained by a linear combination of the p previous values Xt−1, …, Xt−p plus an innovation Et (Et = innovation = ‘completely independent term’). Meaning, we forecast the variable of interest using a linear combination of past values of the variable. The term autoregression indicates that it is a regression of the variable against itself.

For an AR(1) process the definition is:

Xt = a1 · Xt−1 + Et

That means that the random walk is a special case of an AR(1) with a1 = 1.

An AR(p) stochastic process is weakly stationary, if all complex roots of the characteristic polynomial exceed 1 in absolute value.

R-Code: polyroot() function -> computes the complex roots of a given polynomial.
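A sketch of the stationarity check via polyroot() for two AR(1) cases (the coefficients of the characteristic polynomial 1 − a1·z are passed in increasing order):

```r
# Sketch: weak stationarity of AR(1) models via the characteristic polynomial
Mod(polyroot(c(1, -0.5)))   # 2 -> root outside the unit circle: stationary
Mod(polyroot(c(1, -1)))     # 1 -> random walk: root ON the unit circle, not stationary
```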

Changing the parameters ϕ1,…,ϕp results in different time series patterns.

5.1.1 Fitting AR(p)-Models

1) Model identification - is an AR process suitable, and what is the p? - will be based on ACF/PACF-Analysis

2) Parameter Estimation - Regression approach - Yule-Walker-Equations - and more (MLE, Burg-Algorithm)

3) Residual Analysis - to be discussed

5.1.2 Autocorrelation of AR(p) processes

5.2 Parameter identification

The following methods are equivalent for theoretical models and yield similar results on realizations. The models are sensitive to outliers and work best on data that follow a normal (Gaussian) distribution.

5.2.1 OLS - Ordinary Least Square

Procedure:

  1. estimate the global mean m and determine xt = yt − m
  2. carry out a regression on xt without intercept; the resulting parameters are a1, a2, …, ap
  3. estimate the standard deviation (sd) of the innovation from the standard deviation of the residuals

R-code: ar.ols(data, order = p)

5.2.2 Yule Walker Equations

These equations are used for fitting an AR(p)-model:

  1. estimate the ACF from a time series
  2. plug-in the estimates into the Yule-Walker-Equations
  3. the solution are the AR(p)-coefficients

Note: the model coefficients and the ACF values are interlaced

5.2.3 Burg’ Equation’s Algorithm

An alternative fitting algorithm that minimises a different cost function (the forward and backward prediction errors) and exploits the first p values of the series.

R-code: ar.burg(data, order.max=p)

5.2.4 MLE - Maximum Likelihood Estimator

Determines the parameter based on an optimization of the maximum likelihood function.

R-code: arima(data, order=c(2,0,0))
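A sketch comparing the four estimators from this section on one simulated AR(2) series (the true coefficients 0.5 and −0.3 are an arbitrary choice for illustration):

```r
# Sketch: the four AR-fitting methods applied to the same simulated AR(2)
set.seed(100)
x <- arima.sim(model = list(ar = c(0.5, -0.3)), n = 500)
ar.ols(x,  order.max = 2, aic = FALSE)$ar    # regression approach
ar.yw(x,   order.max = 2, aic = FALSE)$ar    # Yule-Walker equations
ar.burg(x, order.max = 2, aic = FALSE)$ar    # Burg's algorithm
coef(arima(x, order = c(2, 0, 0)))[1:2]      # maximum likelihood
# all four estimates should land close to the true values (0.5, -0.3)
```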

5.2.5 AIC - Akaike Information Criterion

Estimates the relative amount of information lost by the considered model. Based on the idea that the less information a model loses, the higher the quality of that model.

5.3 MA(q)

5.4 ARMA (p,q)

AR and MA together gives the following formula:

5.5 ARIMA

ARIMA models extend ARMA models to non-stationary time series by differencing (I = Integrated) and aim to describe the autocorrelations in the data.

5.6 SARIMA

Seasonal ARIMA model

5.7 ETS vs ARIMA

5.8 ARCH/GARCH Models

6 Forecasting

7 Panel Data

8 Data sets and Exercises

8.1 Hstart Data Set

Using the data hstart.dat, we illustrate various methods for descriptive decomposition and elimination of trends. The data contains monthly data on the start of residential construction in the USA within the time frame of January 1966 to December 1974. The data have undergone some transformation unknown to us (perhaps an index over some baseline value has been calculated, or perhaps the data are to be read as x · 10^? construction permits).

hstart <- read.table('https://stat.ethz.ch/Teaching/Datasets/WBL/hstart.dat')
hstart.ts <- ts(hstart[,1], start = c(1966,1), end = c(1974,12), frequency = 12)

Make a time series plot. Is this a stationary time series? If not, what kind of nonstationarity is evident?

plot(hstart.ts, lwd = 2, main = 'Residential construction in the USA')

  • not stationary
  • trend
  • seasonal component
  • decomposition possible into trend, seasonal effect and remainder:

Differences: Try to remove the trend and seasonal effects by computing differences. After removing seasonal effects, choose some linear trend elimination method from the course notes and plot the outcome.

R hint:

H.y <- diff(hstart, lag=12) # time series differences Yt = Xt − Xt−k, lag = k
H.z <- diff(H.y, lag=…)
par(mfrow=c(2,1))
plot(H.y)
plot(H.z)

  • Remove seasonal effects by differencing with lag = 12
hstart.season.diff <- diff(hstart.ts, lag = 12)

hstart.trend.diff <- diff(hstart.season.diff, differences = 1)
plot(hstart.trend.diff, main = "Time series plot after removing the seasonal effect and the trend", lwd = 2)

Code in 2 steps as mentioned above in one code line:

h <- diff(diff(hstart.ts, lag = 12))
plot(h)

8.1.1 Modelling using a parametric model

Decompose the time series into the components using a parametric model.

Plot the time series, including fitted values, and comment on any differences. Choose the order of the polynomial according to how good it fits the real data. Compare the orders 3, 4 and 6.

Time <- 1:length(hstart.ts)
Time2 <- Time/12+1966
Months <- factor(rep(month.name, length(hstart.ts)/12), levels=month.name)
H.lm6 <- lm(hstart.ts ~ Months + Time + I(Time^2) + I(Time^3)+ I(Time^4) + I(Time^5) + I(Time^6))
H.lm3 <- lm(hstart.ts ~ Months + Time + I(Time^2) + I(Time^3))
H.lm4 <- lm(hstart.ts ~ Months + Time + I(Time^2) + I(Time^3)+I(Time^4))
H.fit6 <- ts(fitted(H.lm6), start=1966, freq=12)
H.fit3 <- ts(fitted(H.lm3), start=1966, freq=12)
H.fit4 <- ts(fitted(H.lm4), start=1966, freq=12)
plot(hstart.ts, lwd = 2)
lines(H.fit6, col = 'red', lty = 2)
lines(H.fit3, col = 'purple', lty = 2)
lines(H.fit4, col = 'green', lty = 2)

c) STL decomposition Decompose the time series in trend, seasonal component and remainder using the non-parametric STL method, and add data from this decomposition to the plot from part a).

R hint: The decomposition is made using H.stl <- stl(hstart, s.window = "periodic")

Note: The smoothing parameter for the seasonal effect is chosen by means of s.window.

  • If s.window = "periodic", the seasonal effect is estimated by averaging.
  • It is also possible to specify a value for the smoothing parameter (an odd number). Try e.g. H.stl.var <- stl(hstart, s.window = 15), and compare the result of this to H.stl. Incidentally, summary() can be used for displaying the values of window.

The trend estimation parameter can be set using t.window. Unlike s.window, this argument does have a default value (cf. the help file). Perhaps you could try to vary this parameter as well. The documentation for R and the help files give more details. Trend, seasonal component and remainder of the STL are stored in H.stl$time.series[,"trend"], H.stl$time.series[,"seasonal"] and H.stl$time.series[,"remainder"].

8.1.2 Decomposition of the time series with averaging a smoothing parameter:

hstart.stl <- stl(hstart.ts, s.window = 'periodic')
summary(hstart.stl)
##  Call:
##  stl(x = hstart.ts, s.window = "periodic")
## 
##  Time.series components:
##     seasonal             trend             remainder         
##  Min.   :-39.33508   Min.   : 86.47829   Min.   :-19.692517  
##  1st Qu.:-17.60032   1st Qu.:114.23674   1st Qu.: -5.720787  
##  Median :  5.17326   Median :127.66331   Median :  0.721505  
##  Mean   :  0.00000   Mean   :137.91067   Mean   :  0.083776  
##  3rd Qu.: 18.23888   3rd Qu.:166.55917   3rd Qu.:  5.622462  
##  Max.   : 29.58625   Max.   :198.46523   Max.   : 24.475451  
##  IQR:
##      STL.seasonal STL.trend STL.remainder data 
##      35.84        52.32     11.34         52.75
##    %  67.9         99.2      21.5         100.0
## 
##  Weights: all == 1
## 
##  Other components: List of 5
##  $ win  : Named num [1:3] 1081 19 13
##  $ deg  : Named int [1:3] 0 1 1
##  $ jump : Named num [1:3] 109 2 2
##  $ inner: int 2
##  $ outer: int 0
str(hstart.stl)
## List of 8
##  $ time.series: Time-Series [1:108, 1:3] from 1966 to 1975: -39.34 -38.22 5.59 27.64 29.59 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:3] "seasonal" "trend" "remainder"
##  $ weights    : num [1:108] 1 1 1 1 1 1 1 1 1 1 ...
##  $ call       : language stl(x = hstart.ts, s.window = "periodic")
##  $ win        : Named num [1:3] 1081 19 13
##   ..- attr(*, "names")= chr [1:3] "s" "t" "l"
##  $ deg        : Named int [1:3] 0 1 1
##   ..- attr(*, "names")= chr [1:3] "s" "t" "l"
##  $ jump       : Named num [1:3] 109 2 2
##   ..- attr(*, "names")= chr [1:3] "s" "t" "l"
##  $ inner      : int 2
##  $ outer      : int 0
##  - attr(*, "class")= chr "stl"
plot(hstart.stl, main = "STL with s.window= 'periodic'", lwd = 2)

In the plot, we have the original data, additive seasonal effects, the trend curve and the residuals (remainder) of the STL decomposition, drawn in order from top to bottom.

8.1.3 Decomposition of the time series by manually choosing a smoothing parameter:

hstart.stl.var <- stl(hstart.ts, s.window = 15)
summary(hstart.stl.var)
##  Call:
##  stl(x = hstart.ts, s.window = 15)
## 
##  Time.series components:
##     seasonal             trend             remainder         
##  Min.   :-40.99656   Min.   : 86.12904   Min.   :-21.661154  
##  1st Qu.:-17.70703   1st Qu.:114.64695   1st Qu.: -6.058398  
##  Median :  5.52532   Median :128.41074   Median :  1.289131  
##  Mean   :  0.00057   Mean   :137.85127   Mean   :  0.142598  
##  3rd Qu.: 19.20394   3rd Qu.:166.03389   3rd Qu.:  5.118463  
##  Max.   : 30.47459   Max.   :197.60258   Max.   : 25.067960  
##  IQR:
##      STL.seasonal STL.trend STL.remainder data 
##      36.91        51.39     11.18         52.75
##    %  70.0         97.4      21.2         100.0
## 
##  Weights: all == 1
## 
##  Other components: List of 5
##  $ win  : Named num [1:3] 15 21 13
##  $ deg  : Named int [1:3] 0 1 1
##  $ jump : Named num [1:3] 2 3 2
##  $ inner: int 2
##  $ outer: int 0
str(hstart.stl.var)
## List of 8
##  $ time.series: Time-Series [1:108, 1:3] from 1966 to 1975: -36.98 -38.01 5.33 27.52 27.46 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:3] "seasonal" "trend" "remainder"
##  $ weights    : num [1:108] 1 1 1 1 1 1 1 1 1 1 ...
##  $ call       : language stl(x = hstart.ts, s.window = 15)
##  $ win        : Named num [1:3] 15 21 13
##   ..- attr(*, "names")= chr [1:3] "s" "t" "l"
##  $ deg        : Named int [1:3] 0 1 1
##   ..- attr(*, "names")= chr [1:3] "s" "t" "l"
##  $ jump       : Named num [1:3] 2 3 2
##   ..- attr(*, "names")= chr [1:3] "s" "t" "l"
##  $ inner      : int 2
##  $ outer      : int 0
##  - attr(*, "class")= chr "stl"
plot(hstart.stl.var, main = 'STL with s.window = 15', lwd = 2)

8.1.4 Compare the averaging and the manually chosen parameter for the STL:

par(mfrow = c(1, 2), cex = 0.7, mar = c(2, 2, 0.1, 0.1), oma = c(0, 3, 1.5, 0))
monthplot(hstart.stl$time.series[, "seasonal"], ylab = "")
mtext("Averaging", side = 3)
mtext("Seasonal component", side = 2, line = 2, cex = 0.8)
monthplot(hstart.stl.var$time.series[, "seasonal"], ylab = "")
mtext("loess window 15", side = 3)

Answer: We see that the monthly effects of hstart.stl remain constant over the years, whereas those of hstart.stl.var vary: the effects of May to August increase with the years, while the effects of January, September and December decrease.

hstart.filter <- filter(hstart.ts, c(1,rep(2,11), 1)/24)
plot(hstart.filter)

Answer: The trend line from the special filter is somewhat less smooth than the trend line stemming from the STL decomposition. However, the smoothness of the STL trend line can be controlled by the smoothing parameter t.window of the function stl().

8.1.5 Plot the original data and the STL-Decomposition (trend + season without remainder) vs. time.

plot(hstart.ts, lwd = 2.5, main = "Data vs. STL")
lines(hstart.ts - hstart.stl$time.series[,'remainder'], lwd = 2, lty = 'dashed', col = 'red')
legend('topleft', legend = c('Data', 'STL'), col = c(1,2), lty = c(1,2), lwd = c(2.5,2))

Answer: The non-parametric model STL fits the data better than the parametric one does.

8.1.6 Plot the remainder only from the STL-Decomposition vs. time.

plot(ts(hstart.stl$time.series[,'remainder'], start = c(1966,1), frequency = 12),lty = 1, lwd = 2, ylab='Residuals', main = 'Residual plot', col = 1)
lines(ts(resid(H.lm6), start = 1966, frequency = 12), lty=3, col =2, lwd = 1.5)
abline(h=0)
legend('topleft', legend=c("STL","Parametric model 6"), lty=c(1,3), col=c(1,2))

No systematic behavior is visible in the residual plot, and hence it can be concluded that the combination of the estimated trend and the seasonal effect fits the data reasonably well. The discrepancies between the two models are quite visible: over certain intervals of time, the residuals of the parametric model differ from zero in a systematic way.

8.1.7 Calculating the decomposition

hstart.decomp <- decompose(hstart.ts, type = 'additive')
plot(hstart.decomp, lwd = 2)

8.1.8 Plot the time series and three different estimates of the trend into one plot:

plot(hstart.ts, lwd = 2)
lines(hstart.stl$time.series[,'trend'], lty =1, col = 'green', lwd = 2)
lines(hstart.filter, lty = 2, col = 'purple', lwd = 2)
lines(hstart.decomp$trend, lty = 3, col = 'red', lwd = 2)
legend('topleft', legend = c('Time Series', 'STL trend', 'Filter', 'Decompose'), lty = c(1,1,2,3), bty = 'n', col = c('black', 'green', 'purple', 'red'))

We reconsider the data set from exercise 1.2 about residential construction in the USA from January 1966 to December 1974.

hstart <- read.table('https://stat.ethz.ch/Teaching/Datasets/WBL/hstart.dat')
head(hstart)
##      V1
## 1  81.9
## 2  79.0
## 3 122.4
## 4 143.0
## 5 133.9
## 6 123.5
ts.hstart <- ts(hstart[,1], start = c(1966,1), frequency = 12)
ts.hstart
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1966  81.9  79.0 122.4 143.0 133.9 123.5 100.0 103.7  91.9  79.1  75.1  62.3
## 1967  61.7  63.2  92.9 115.9 134.2 131.6 126.1 130.2 125.8 137.0 120.2  83.1
## 1968  82.7  87.2 128.6 164.9 144.5 142.5 142.3 141.0 139.5 143.3 129.5  99.3
## 1969 105.8  94.6 135.6 159.9 157.7 150.5 126.5 127.5 132.9 125.8  97.4  85.3
## 1970  69.2  77.2 117.8 130.6 127.3 141.9 143.5 131.5 133.8 143.8 128.3 124.1
## 1971 114.8 104.6 169.3 203.6 203.5 196.8 197.0 205.9 175.6 181.7 176.4 155.3
## 1972 150.9 153.6 205.8 213.2 227.9 226.2 207.5 231.0 204.4 218.2 187.1 152.7
## 1973 147.3 139.5 201.1 205.4 234.2 203.4 203.2 199.9 148.9 149.5 134.6  90.6
## 1974  86.2 109.6 127.2 160.9 149.9 149.5 127.2 114.0  99.6  97.2  75.1  54.9

a) Decompose the time series in trend, seasonal component and remainder using the non-parametric STL method, and plot the results.

hstart.stl <- stl(ts.hstart, s.window = 'per')

plot(hstart.stl, lwd = 1.5, col = 'red')

b) The special filter Yt = 1/24 (Xt−6 + 2Xt−5 + … + 2Xt + … + Xt+6) can be used for computing a trend estimate.

weights <- c(1, rep(2,11), 1)/24 # 13 weights in total, covering t-6 until t+6
weights
##  [1] 0.04166667 0.08333333 0.08333333 0.08333333 0.08333333 0.08333333
##  [7] 0.08333333 0.08333333 0.08333333 0.08333333 0.08333333 0.08333333
## [13] 0.04166667
hstart.filtertrend <- filter(ts.hstart, filter = weights, sides = 2)
hstart.stl <- stl(ts.hstart, s.window = 'per')

Plot this, the STL trend and the data in a single plot. What are the differences between the two methods?

plot(ts.hstart, lwd = 2, main = 'hstart data  with trend lines (filter and stl)')
lines(hstart.filtertrend, col = 'red', lwd = 2)
lines(hstart.stl$time.series[,'trend'], col = 'green', lwd = 3, lty = 2)
legend('topleft', legend = c('data', 'filter', 'stl'), lty = c(1,1,2), col = c('black', 'red', 'green'), cex = 0.8)

c) Try to remove the trend and seasonal effects by computing differences. After removing seasonal effects, choose some linear trend elimination method and plot the outcome.

hstart.adj.season <- diff(ts.hstart, lag = 12, differences = 1)
plot(hstart.adj.season, lwd = 2, main = 'hstart data after removing seasonal pattern')

A trend is still visible; eliminate it by backward differencing once more to produce a stationary time series.

hstart.adj.trend <- diff(hstart.adj.season, lag = 1)
plot(hstart.adj.trend, lwd = 2, main = 'hstart dataset after eliminating season and trend effects')

8.1.9 Backshift differencing

Remove the linear trend by applying backward differencing on timeseries created from the following models:

  1. Xt ∼ 0.5t + 1 + Ut, where Ut ∼ U(−1, 1)
t <- seq(1,100,length =100)
Ut <- runif(100,-1,1)
data <- 0.5 * t + 1 + Ut
ts.data <- ts(data)

plot(ts.data)

yt <- diff(ts.data)

#comparison of data length
length(ts.data)
## [1] 100
length(yt)
## [1] 99
plot(yt, lwd = 1.5, main = 'Xt ∼ 0.5t + 1 + Ut, where Ut ∼ U(−1, 1)')

  2. Xt ∼ 2t^2 + 3t − 1 + Ut, where Ut ∼ U(−200, 200)
t <- seq(1, 100, length=100)
Ut <- runif(100, -200, 200)
data <- 2 * t^2 + 3 * t - 1 + Ut
ts.data <- ts(data)

plot(ts.data)

yt <- diff(ts.data, differences = 2)

#comparison of length
length(ts.data)
## [1] 100
length(yt)
## [1] 98
plot(yt, lwd = 1.5, main = 'Xt ∼ 2t^2 + 3t − 1 + Ut, where Ut ∼ U(−200, 200)')

8.2 CO2 Data set

To test ideas and algorithms, R comes with built-in data sets. The data used in this exercise is called co2 and contains atmospheric concentrations of CO2 in parts per million.

data(co2)
class(co2)
## [1] "ts"
plot(co2, main = 'co2 data', lwd = 1.5)

Use backward differencing on the co2 data to remove the seasonal effect. Which value of the lag yields the best reduction of the seasonality? What happens if you choose other values for the lag?

co2.adj.season <- diff(co2, differences = 1, lag = 12)
plot(co2.adj.season, lwd = 1.5, main = 'co2 dataset after eliminating seasonal pattern with diff = 1, lag =12')

co2.adj.bd <- diff(co2.adj.season, lag = 1, differences = 1)
plot(co2.adj.bd, lwd = 1.5, main = 'co2 dataset after eliminating seasonal pattern with diff = 1, lag =12 and diff = 1, lag = 1')
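Since the co2 series has frequency 12 (a yearly cycle), lag = 12 is the natural choice. To illustrate what happens with other values, one can compare a mismatched lag; a sketch (the choice lag = 6 below is just one example):

```r
## Differencing at a lag that does not match the seasonal period
## (e.g. lag = 6) leaves a clear periodic pattern in the result:
co2.lag6 <- diff(co2, lag = 6)
plot(co2.lag6, lwd = 1.5, main = 'co2 after differencing with lag = 6')

## The leftover seasonality also shows up as a much larger variance
## than for the matching lag of 12:
var(diff(co2, lag = 6))
var(diff(co2, lag = 12))
```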

Once again have a look at the co2 data set. In this exercise you should try to decompose the series into trend, seasonality and random parts using a linear additive filter. For the seasonal part, the hints below should help you calculate the means over the same months in different years.

8.2.1 TREND ESTIMATION

Attention: if the frequency is even, we have to give half weight to the two border months if we want to end up with a symmetric window.

co2.data <- co2
weights <- c(0.5, rep(1,11), 0.5)/12
trend <- filter(co2.data, filter = weights, sides = 2)
plot(co2.data, lwd= 2, main = 'co2 decomposition with linear additive filter - TREND ESTIMATION')
lines(trend, col= 'red', lwd = 2.5)

8.2.2 SEASONALITY ESTIMATION

weights <- c(0.5, rep(1,11), 0.5)/12
trend.est <- filter(co2, filter = weights, sides = 2)
trend.adj <- co2 - trend.est
month <- factor(rep(1:12, 39))

season.est <- tapply(trend.adj, month, mean, na.rm = T)
plot(season.est, type = 'h', xlab = 'Month', main = 'co2 decomposition with linear additive filter - SEASON ESTIMATION', col = 'red')
abline(h = 0)

For a combined solution, have a look at the decompose() function:

co2.data <- co2
co2.data.decomp <- decompose(co2.data)
plot(co2.data.decomp, lwd = 1.5, col = 'red')

8.3 Creatine Data set

The performance of a machine that analyses the creatine content of human muscular tissue is to be investigated using 157 known samples. These samples are given to the machine one at a time for it to determine their creatine content. In this exercise, we would like to check whether it is operating correctly, i.e. whether the measured values do not depend on the measuring instance. We focus only on the variable “gehalt” (content) in the data.

keratin <- read.table('https://stat.ethz.ch/Teaching/Datasets/WBL/kreatin.dat')
keratin.ts <- ts(keratin[,2], start = 1, frequency = 1)

a) Which stochastic model should this series of data follow if the machine is working correctly?

plot(keratin.ts, lwd = 2, main = 'Creatine level')

Answer: If the machine operates correctly, the measurements should not depend on the measuring instance, i.e. the series should behave like i.i.d. white noise fluctuating around a constant mean.

b) Use the time series plot, the autocorrelations and the partial autocorrelations to determine whether or not these data fit the ideal model found in Part a).

ACF

keratin.acf <- acf(keratin.ts, plot = T)

PACF

keratin.pacf <- pacf(keratin.ts, plot = T)
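The visual inspection can be complemented by a portmanteau test; a sketch using the Ljung-Box test (the choice of 10 lags is arbitrary):

```r
## H0: the first 10 autocorrelations are jointly zero (white noise).
## A large p-value gives no evidence against the ideal i.i.d. model from part a).
Box.test(keratin.ts, lag = 10, type = 'Ljung-Box')
```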

8.4 Matching Correlograms

Below you find the plots and the correlograms of four data sets. The correlograms have been permuted. Find for each data set (A-D) the appropriate correlogram (1-4).
plots and correlograms

Solution: A - 2, B - 3, C - 1, D - 4

8.5 Chocolate, beer, electricity Data set

Let us now consider the electricity production of Australia in GWh in the period from January 1958 to December 1990. You may download the data from http://stat.ethz.ch/Teaching/Datasets/WBL/cbe.dat.

The aim of this exercise is to compare the effect of different algorithms for decomposing a time series into trend, seasonality and remainder, by means of the (partial) autocorrelation function of the remainder.

data <- read.table('https://stat.ethz.ch/Teaching/Datasets/WBL/cbe.dat', header = TRUE)
head(data)
##   choc beer elec
## 1 1451 96.3 1497
## 2 2037 84.4 1463
## 3 2477 91.2 1648
## 4 2785 81.9 1595
## 5 2994 80.5 1777
## 6 2681 70.4 1824
ts.elec <- ts(data[,'elec'], start = c(1958,1), end = c(1990, 12), freq= 12)
head(ts.elec)
## [1] 1497 1463 1648 1595 1777 1824

a) Start by considering the plot of the time series. Why is it not meaningful to interpret the correlogram of this time series? Explain in a few sentences.

plot(ts.elec, lwd = 1.5)

Answer: The time series displays a clear trend and a clear seasonality effect. Therefore, the fundamental assumption of the autocorrelation function analysis (stationarity) is violated.

b) Decompose the time series into trend, seasonal component and remainder using the R function decompose(), which performs the decomposition with moving averages.

elec.decomp <- decompose(ts.elec, type = 'multiplicative')
plot(elec.decomp)

Plot the remainder (random) and its correlogram (acf) and interpret the plots in a few sentences. The function employs a filter to estimate the trend; therefore, the first and the last few entries of the decomposition are not defined, i.e. they have the value NA in R. To prevent errors in R, the parameter na.action = na.pass (asking R to ignore NA entries) has to be employed.

plot(elec.decomp$random, ylab = 'remainder')

acf(elec.decomp$random, na.action = na.pass, plot= T, ylim = c(-1,1))

Answer: The trend is removed from the time series, but we still see a periodic behaviour in the computed remainder. The period of one year equals that of the seasonal component of the original time series. These effects are also visible in the damped oscillatory behaviour of the correlogram.

c) Decompose the log-transformed time series using the R function stl(). Estimate the seasonal effect once by averaging over all years (parameter s.window = “periodic”) and once by choosing an appropriate smoothing window (parameter s.window = …). Recall that the window length has to be odd. An appropriate smoothing window may be determined by the R-function monthplot(). For both estimation approaches (averaging and smoothing window), plot the remainder and its correlogram, and comment on the plots.

elec.stl <- stl(log(ts.elec), s.window = 'per')
plot(elec.stl)

plot(elec.stl$time.series[,'remainder'], ylab = 'remainder')

acf(elec.stl$time.series[,'remainder'], plot= T, ylim =c(-1,1))

Answer: The results are compatible with those of decompose(). The remainder still displays a periodic behaviour with a period of one year.

a.05 <- stl(ts.elec, s.window = 5)
monthplot(a.05)

a.13 <- stl(ts.elec, s.window= 13)
monthplot(a.13)

a.17 <- stl(ts.elec, s.window = 17)
monthplot(a.17)

a.19 <- stl(ts.elec, s.window = 19)
monthplot(a.19)

a.23 <- stl(ts.elec, s.window = 23)
monthplot(a.23)

Using a smoothing window of 19 (the size is chosen based on the monthplot; other values of similar magnitude yield comparable results), we find the following picture:

elec.stl.w19 <- stl(log(ts.elec), s.window = 19)
plot(elec.stl.w19)

plot(elec.stl.w19$time.series[,'remainder'], ylab= 'remainder')


acf(elec.stl.w19$time.series[,'remainder'], plot = T, ylim = c(-1,1))

Findings: Now, the periodicity is removed much better from the time series. The autocorrelation function of the remainder at lag 1 is not significant any more. The difference between the averaging and the smoothing window approach goes along with different concepts behind the two methods. In the averaging approach, we model the seasonal component as truly periodic and explain the slight deviations from this fitted component as a stationary random process, which in our case has the same periodicity as the underlying seasonal effect.

In the smoothing window approach, we allow the seasonal component to vary (slightly) over time, which makes it possible to eliminate the effects with the frequency of the seasonal component from the remainder. On the other hand, this approach is more prone to overfitting, since the slight deviations from a truly periodic behaviour in the seasonal component are interpreted as deterministic.

d) Explain why you used the parameter type = “multiplicative” in Task b), and why you log-transformed the time series before performing an stl()decomposition in Task c).

The seasonal oscillations of the time series (see the plot in Task a)) are approximately proportional to the trend. Therefore, a multiplicative model is appropriate for describing the time series: Xt = mt · St · Rt. A log-transformation turns the multiplicative model into an additive one: log(Xt) = log(mt) + log(St) + log(Rt).

The R function decompose() can handle multiplicative models directly, whereas stl() requires the reduction to an additive model.
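This relationship can also be checked numerically; a sketch (the agreement is close but not exact, since the moving-average trend of log(Xt) is not exactly the log of the moving-average trend of Xt):

```r
## Multiplicative decomposition of the raw series ...
dec.mult <- decompose(ts.elec, type = 'multiplicative')
## ... versus additive decomposition of the log-transformed series:
dec.add <- decompose(log(ts.elec), type = 'additive')

## The log of the multiplicative trend is close (not identical) to the
## additive trend of the log series:
plot(log(dec.mult$trend) - dec.add$trend, ylab = 'difference')
```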

e) As a last algorithm, consider the differencing approach. Choose lags of 12 and 1 (months) to eliminate the periodic structure and the trend.

elec.diff12 <- diff(ts.elec, lag = 12)
elec.diff12.1 <- diff(elec.diff12, lag = 1)

Plot the resulting time series and autocorrelation function. Compare the results to the previous methods.

plot(elec.diff12.1)

acf(elec.diff12.1, ylim = c(-1,1))

Findings: The acf looks quite different: there is strong negative autocorrelation at lags 1 and 12. Therefore, the model is probably not linear, and this method is not appropriate here.

8.6 Plug-in estimator

In this exercise, we will calculate the lagged scatter plot and the plug-in estimator without employing the internal R function.

  1. Write a function to calculate the lagged scatter plot estimator for the autocorrelation. For this, you may extend the code given in the lecture notes.
lagged <- function(data) {
  n <- length(data)
  lagCorrel <- rep(0, n)
  for (i in 0:(n - 1)) {
    lagCorrel[i + 1] <- cor(data[1:(n - i)], data[(i + 1):n])
  }
  # return the vector of lagged correlations
  lagCorrel
}
  2. Develop a function to calculate the plug-in estimator for the autocorrelation. For the implementation of the plug-in estimator, we calculate the individual defining γ(k) factors first and then assign the fraction γ(k)/γ(0) to the estimator ρ̂(k):
plugIn <- function(data) {
  n <- length(data)
  meanData <- mean(data)
  deltaX <- data - meanData
  gamma0 <- sum(deltaX * deltaX)/n
  plugIn <- rep(0, n)
  for (i in 0:(n - 1)) {
    gammaK <- sum(deltaX[(i + 1):n] * deltaX[1:(n - i)])/n
    plugIn[i + 1] <- gammaK/gamma0
  }
  plugIn
}
  3. Calculate the two estimates for the beer and the chicken data sets. Both are contained in the “fma” package; in case it is not already loaded, load it with library(fma).
library(fma)
## Warning: package 'fma' was built under R version 3.6.3
## Warning: package 'forecast' was built under R version 3.6.3
lagBeer <- lagged(beer)
plugBeer <- plugIn(beer)
plot(lagBeer, type = "l", ylab = "ACF Estimator")
lines(plugBeer, col = "red")

lagChicken <- lagged(chicken)
plugChicken <- plugIn(chicken)
plot(lagChicken, type = "l", ylab = "ACF Estimator")
lines(plugChicken, col = "red")

The beer data set displays a high agreement between the two calculation methods, whereas for the chicken data set the two estimators disagree strongly.
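As a sanity check, the plug-in estimator can be compared with R's built-in acf(), which (with its default demean = TRUE) computes the same quantity; a sketch using plugIn() from above and the beer series from fma:

```r
## The built-in estimate for lags 0 to 19 ...
builtin <- acf(beer, lag.max = 19, plot = FALSE)$acf[, 1, 1]
## ... agrees with our plug-in estimate up to numerical precision:
ours <- plugIn(beer)[1:20]
max(abs(builtin - ours))
```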

8.7 AR(1) process

In this exercise, we would like to investigate the properties of an AR(1) process.

  1. Simulate a realisation of the process Xt = 0.8 · Xt−1 + et with et an innovation process of length 1000.

  2. Calculate the theoretical autocorrelation function and the plug-in estimator of the autocorrelation of the simulation results in a) and plot both curves for lags from 0 to 100.

  3. What is the functional dependence of the theoretical autocorrelation function on the lag k and α1 = 0.8?

  4. Now compare the theoretical partial autocorrelation function with the estimated version for the simulated process. Which particularity do you observe in the two representations?
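A possible solution sketch for parts 1-4 (for part 3: the theoretical autocorrelation of an AR(1) with coefficient α1 is ρ(k) = α1^k, a geometric decay in the lag k):

```r
set.seed(100)

## 1. simulate a realisation of length 1000 of Xt = 0.8 * Xt-1 + Et
x <- arima.sim(model = list(ar = 0.8), n = 1000)

## 2./3. theoretical ACF rho(k) = 0.8^k versus the plug-in estimate
k <- 0:100
rho.theo <- 0.8^k
rho.hat <- acf(x, lag.max = 100, plot = FALSE)$acf[, 1, 1]
plot(k, rho.theo, type = 'l', lwd = 2, ylab = 'ACF',
     main = 'AR(1): theoretical vs estimated ACF')
lines(k, rho.hat, col = 'red')

## 4. PACF: for an AR(1), the theoretical partial autocorrelation is zero
## for all lags larger than 1; the estimate shows small nonzero values there.
pacf(x, lag.max = 20)
```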

8.8 Data set with ts1 and ts2

In this exercise, we consider two time series ts1 and ts2, which putatively were created by an AR process. You may download the data from http://stat.ethz.ch/Teaching/Datasets/WBL/ts_S3_A2.dat

  1. Visualise both time series. Are both time series stationary? What is their mean?

  2. Consider the (partial) autocorrelation function and decide whether the two time series can be generated by an AR process. If yes, what is the order of the respective AR process? Hint: The partial autocorrelation function of an AR(p) process displays a sudden drop for lags larger than p.
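A sketch of how one might proceed (assumption: the file is whitespace-separated with two columns; their names are not documented here, so the columns are addressed by position):

```r
dat <- read.table('http://stat.ethz.ch/Teaching/Datasets/WBL/ts_S3_A2.dat', header = TRUE)
ts1 <- ts(dat[, 1])
ts2 <- ts(dat[, 2])

## 1. visual check for stationarity and the sample means
par(mfrow = c(2, 1))
plot(ts1, main = 'ts1')
plot(ts2, main = 'ts2')
mean(ts1)
mean(ts2)

## 2. an AR(p) process shows a cutoff in the PACF after lag p
par(mfrow = c(2, 2))
acf(ts1); pacf(ts1)
acf(ts2); pacf(ts2)
```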

8.9 AR(3) model with coefficients

Let us consider the AR(3) model with coefficients α1 = 0.6, α2 = −0.5 and α3 = 0.4: Xt = 0.6 · Xt−1 − 0.5 · Xt−2 + 0.4 · Xt−3 + Et

  1. Simulate one realisation of length 50 of the time series and plot it. Would you assume that this time series is stationary?

  2. Calculate the estimated (partial) autocorrelation function and compare it to the theoretical function. Hint: Compare exercise 3.3 for hints.

  3. Preview to week 5: Calculate the roots of the polynomial Φ(z) = 1 − α1 · z − α2 · z^2 − α3 · z^3 with the R function polyroot. What do you observe for the absolute values of the roots?
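A possible solution sketch (the innovations Et ∼ N(0, 1) are an assumption, since the exercise does not specify them):

```r
set.seed(100)

## 1. simulate a realisation of length 50 and plot it
x <- arima.sim(model = list(ar = c(0.6, -0.5, 0.4)), n = 50)
plot(x, main = 'AR(3) simulation')

## 2. estimated versus theoretical (partial) autocorrelation function
acf(x)
pacf(x)
ARMAacf(ar = c(0.6, -0.5, 0.4), lag.max = 20)              # theoretical ACF
ARMAacf(ar = c(0.6, -0.5, 0.4), lag.max = 20, pacf = TRUE) # theoretical PACF

## 3. roots of Phi(z) = 1 - 0.6 z + 0.5 z^2 - 0.4 z^3
roots <- polyroot(c(1, -0.6, 0.5, -0.4))
Mod(roots)  # all absolute values exceed 1, i.e. the process is stationary
```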